The R package SemiPar contains the milan.mort dataset on the short-term effect of air pollution on mortality. The data comprise 3,652 daily observations on 9 variables, described in the package help file. The data are also analysed in the book by Ruppert, Wand and Carroll (2003). The original reference is:
Vigotti, M.A., Rossi, G., Bisanti, L., Zanobetti, A. and Schwartz, J. (1996). Short term effect of urban air pollution on respiratory health in Milan, Italy, 1980-1989. Journal of Epidemiology and Community Health, 50, 71-75.
After performing some exploratory analyses:
Taking tot.mort (or a suitable transformation of it) as a normally distributed response variable, build a model for the average number of deaths, checking whether some of the covariates may have a nonlinear effect (do not consider the resp.mort variable). Follow a Bayesian approach for the task and check the model fit via posterior predictive checks.
First, I create a new column called avg.mort, which will serve as the response variable: for each observation it is the average number of deaths over the last 3 days (the first two observations, for which no 3-day average exists, are discarded).
# data summary
summary(milan.mort)
## day.num day.of.week holiday mean.temp
## Min. : 1.0 Min. :1 Min. :0.00000 Min. :-6.10
## 1st Qu.: 913.8 1st Qu.:2 1st Qu.:0.00000 1st Qu.: 7.10
## Median :1826.5 Median :4 Median :0.00000 Median :13.70
## Mean :1826.5 Mean :4 Mean :0.02793 Mean :14.04
## 3rd Qu.:2739.2 3rd Qu.:6 3rd Qu.:0.00000 3rd Qu.:21.40
## Max. :3652.0 Max. :7 Max. :1.00000 Max. :31.50
## rel.humid tot.mort resp.mort SO2
## Min. : 0.00 Min. :10.00 Min. : 0.000 Min. :-17.57
## 1st Qu.:50.30 1st Qu.:26.00 1st Qu.: 1.000 1st Qu.: 32.25
## Median :62.30 Median :31.00 Median : 2.000 Median : 64.77
## Mean :62.01 Mean :31.86 Mean : 2.202 Mean :114.74
## 3rd Qu.:74.70 3rd Qu.:37.00 3rd Qu.: 3.000 3rd Qu.:157.75
## Max. :99.70 Max. :66.00 Max. :12.000 Max. :827.75
## TSP
## Min. : 3.50
## 1st Qu.: 83.61
## Median :117.09
## Mean :136.49
## 3rd Qu.:170.50
## Max. :529.50
# tot.mort histogram
#ggplot(milan.mort, aes(x = tot.mort)) +
# geom_bar()
# tot.mort histogram by week day
ggplot(milan.mort, aes(x=tot.mort)) +
geom_bar(aes(fill=factor(day.of.week)), width = 0.5)
# avg number of deaths
#milan.mort %>% summarize(mean(tot.mort))
milan.mort <- milan.mort %>%
mutate(avg.mort = rollmean(x = tot.mort, 3, align = "right", fill = NA))  # zoo::rollmean
old.milan.mort <- milan.mort
milan.mort <- milan.mort[3:nrow(milan.mort),]
head(milan.mort, 10)
## day.num day.of.week holiday mean.temp rel.humid tot.mort resp.mort
## 3 3 4 0 4.6 29.7 37 0
## 4 4 5 0 2.9 32.7 33 1
## 5 5 6 0 2.2 71.3 36 1
## 6 6 7 0 0.7 80.7 45 6
## 7 7 1 0 -0.6 82.0 46 2
## 8 8 2 0 -0.5 82.7 38 4
## 9 9 3 0 0.2 79.3 29 1
## 10 10 4 0 1.7 69.3 39 4
## 11 11 5 0 2.7 49.7 36 2
## 12 12 6 0 0.8 62.3 35 4
## SO2 TSP avg.mort
## 3 276.25 162.16 38.00000
## 4 440.50 197.52 34.00000
## 5 354.25 234.59 35.33333
## 6 334.50 167.34 38.00000
## 7 619.50 216.49 42.33333
## 8 560.61 236.10 43.00000
## 9 535.09 251.85 37.66667
## 10 524.92 269.95 35.33333
## 11 512.25 278.65 34.66667
## 12 474.75 286.65 36.66667
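As a quick sanity check on the right alignment of rollmean, the first avg.mort values can be reproduced by hand. The first four tot.mort values in the full data are 45, 32, 37, 33 (visible in the data list later on); a minimal base-R sketch:

```r
# right-aligned 3-day rolling mean: element i averages x[i-2], x[i-1], x[i]
roll3 <- function(x) {
  out <- rep(NA_real_, length(x))
  for (i in seq_along(x)) if (i >= 3) out[i] <- mean(x[(i - 2):i])
  out
}

# first four tot.mort values
roll3(c(45, 32, 37, 33))  # NA NA 38 34, matching avg.mort for rows 3 and 4
```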
# pairplots for numerical variables
pairs(avg.mort ~ mean.temp + rel.humid + SO2 + TSP, data=milan.mort)
ggplot(milan.mort, aes(x = TSP, y = avg.mort)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE)
ggplot(milan.mort, aes(x = SO2, y = avg.mort)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE)
Motivated by the relationships above, I start with the predictors \(\text{SO2}\), the sulphur dioxide level in the air, and \(\text{TSP}\), the total suspended particles:
\[ \text{avg\_mort}_i \sim \mathcal{N}(\mu_i,\sigma^2), \qquad \mu_i = \alpha + \beta\, \text{SO2}_i + \gamma\, \text{TSP}_i \] with \(\mathcal{N}(0,30)\) priors on \(\alpha,\beta,\gamma\) and a \(\text{Cauchy}(0,30)\) prior on \(\sigma\).
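The contents of model1.stan are not shown in the report; a Stan program consistent with this specification, and with the later extract_log_lik and y_rep calls, might look as follows (a sketch, not necessarily the author's exact file):

```stan
data {
  int<lower=1> N;
  vector[N] avg_mort;
  vector[N] SO2;
  vector[N] TSP;
  vector[N] mean_temp;  // passed in the data list but unused in this model
}
parameters {
  real alpha;
  real beta;
  real gamma;
  real<lower=0> sigma;  // lower bound makes the Cauchy prior half-Cauchy
}
model {
  alpha ~ normal(0, 30);
  beta ~ normal(0, 30);
  gamma ~ normal(0, 30);
  sigma ~ cauchy(0, 30);
  avg_mort ~ normal(alpha + beta * SO2 + gamma * TSP, sigma);
}
generated quantities {
  vector[N] log_lik;  // needed by extract_log_lik()
  vector[N] y_rep;    // needed for the posterior predictive checks
  for (i in 1:N) {
    real mu = alpha + beta * SO2[i] + gamma * TSP[i];
    log_lik[i] = normal_lpdf(avg_mort[i] | mu, sigma);
    y_rep[i] = normal_rng(mu, sigma);
  }
}
```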
## arrange data into a list
data <- list(
N = nrow(milan.mort),
avg_mort = milan.mort$avg.mort,
SO2 = milan.mort$SO2,
TSP = milan.mort$TSP,
mean_temp = milan.mort$mean.temp
)
## compile the model
model1 <- stan_model("model1.stan", auto_write = TRUE)
## recompiling to avoid crashing R session
## fit the model
model1_fit <- sampling(model1, data = data, cores=4)
print(model1_fit, pars = c('alpha','beta','gamma','sigma'))
## Inference for Stan model: model1.
## 4 chains, each with iter=2000; warmup=1000; thin=1;
## post-warmup draws per chain=1000, total post-warmup draws=4000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
## alpha 29.28 0 0.18 28.94 29.15 29.28 29.40 29.64 1492 1
## beta 0.03 0 0.00 0.03 0.03 0.03 0.03 0.03 3493 1
## gamma -0.01 0 0.00 -0.01 -0.01 -0.01 -0.01 0.00 1941 1
## sigma 5.18 0 0.06 5.06 5.14 5.18 5.22 5.30 1887 1
##
## Samples were drawn using NUTS(diag_e) at Sun Jun 16 15:14:00 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
# for model comparison
loo_1 <- loo(extract_log_lik(model1_fit))
## Warning: Relative effective sample sizes ('r_eff' argument) not specified.
## For models fit with MCMC, the reported PSIS effective sample sizes and
## MCSE estimates will be over-optimistic.
print(loo_1)
##
## Computed from 4000 by 3650 log-likelihood matrix
##
## Estimate SE
## elpd_loo -11181.2 52.0
## p_loo 4.6 0.3
## looic 22362.5 104.1
## ------
## Monte Carlo SE of elpd_loo is 0.0.
##
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.
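As a reading aid for the table above: looic is elpd_loo on the deviance scale, i.e. looic = -2 * elpd_loo, which matches the printed values up to rounding:

```r
# looic is -2 times elpd_loo
-2 * (-11181.2)  # 22362.4, printed above as 22362.5 (rounding of elpd_loo)
```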
## traceplots
plot(model1_fit, plotfun = "trace", pars = c('alpha','beta','gamma','sigma')) + ggtitle("traceplots")
## acf plots
mcmc_acf(as.matrix(model1_fit, pars=c("alpha","beta","gamma","sigma")))
# for model comparison
log_lik_1 <- extract_log_lik(model1_fit)
loo_1 <- loo(log_lik_1)
waic_1 <- waic(log_lik_1)
Rhat values equal to one indicate convergence of the chains, which is confirmed by the well-mixed traceplots. The autocorrelation also decays quickly for all parameters.
## parameters distributions
mcmc_hist(as.matrix(model1_fit, pars = c('alpha','beta','gamma','sigma')))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
# relationship bw estimated parameters
mcmc_scatter(as.matrix(model1_fit, pars = c('alpha','beta')), alpha = 0.2)
mcmc_scatter(as.matrix(model1_fit, pars = c('beta','gamma')), alpha = 0.2)
mcmc_scatter(as.matrix(model1_fit, pars = c('alpha','sigma')), alpha = 0.2)
The scatterplots show a relationship between the parameters beta and gamma, suggesting correlation between the SO2 and TSP predictors.
## posterior predictive check
y <- milan.mort$avg.mort
y_rep <- as.matrix(model1_fit, pars = "y_rep")
ppc_dens_overlay(y = milan.mort$avg.mort, y_rep[1:200,])
## standardised residuals of the observed vs predicted average number of deaths
mean_y_rep <- colMeans(y_rep)
std_resid <- (milan.mort$avg.mort - mean_y_rep) / sqrt(mean_y_rep)
qplot(mean_y_rep, std_resid) + hline_at(0, color="gray50")
# real mean vs estimated mean
ppc_stat(y = milan.mort$avg.mort, yrep = y_rep, stat = "mean")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
#print(c(mean(y),mean(y_rep)))
In the posterior predictive check, the simulated distributions resemble the observed one.
The standardised residuals are mostly positive, which means the model underestimates the observed outcomes.
#ggplot(milan.mort, aes(x = rel.humid, y = tot.mort)) +
# geom_point() +
# geom_smooth(method = "lm", se = FALSE)
## mean.temp vs tot.mort
ggplot(milan.mort, aes(x = mean.temp, y = avg.mort)) +
geom_point() +
geom_smooth(method = "lm", se = FALSE)
The plot shows an evident relationship between the average number of deaths and the mean daily temperature, so I add temperature to the model: \[ \text{avg\_mort}_i \sim \mathcal{N}(\mu_i,\sigma^2), \qquad \mu_i = \alpha + \beta\, \text{SO2}_i + \gamma\, \text{TSP}_i + \delta\, \text{mean\_temp}_i \] I use the same priors as in the first model, plus a \(\mathcal{N}(0,30)\) prior on \(\delta\).
## compile the model
model2 <- stan_model("model2.stan", auto_write = TRUE)
## recompiling to avoid crashing R session
## fit the model
model2_fit <- sampling(model2, data = data, cores = 4)
print(model2_fit, pars = c('alpha','beta','sigma'))
## Inference for Stan model: model2.
## 4 chains, each with iter=2000; warmup=1000; thin=1;
## post-warmup draws per chain=1000, total post-warmup draws=4000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
## alpha 34.40 0.01 0.31 33.77 34.20 34.40 34.61 35.02 1390 1
## beta 0.02 0.00 0.00 0.02 0.02 0.02 0.02 0.02 2714 1
## sigma 4.93 0.00 0.06 4.82 4.89 4.92 4.96 5.03 1661 1
##
## Samples were drawn using NUTS(diag_e) at Sun Jun 16 15:20:19 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
# for model comparison
log_lik_2 <- extract_log_lik(model2_fit)
loo_2 <- loo(log_lik_2)
## Warning: Relative effective sample sizes ('r_eff' argument) not specified.
## For models fit with MCMC, the reported PSIS effective sample sizes and
## MCSE estimates will be over-optimistic.
waic_2 <- waic(log_lik_2)
# parameters distributions
mcmc_hist(as.matrix(model2_fit, pars = c('alpha','beta','gamma','delta','sigma')))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## posterior predictive checking
y_rep <- as.matrix(model2_fit, pars = "y_rep")
ppc_dens_overlay(y = y, y_rep[1:200,])
# real mean vs estimated mean
ppc_stat(y = milan.mort$avg.mort, yrep = y_rep, stat = "mean")
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
print(c(mean(milan.mort$avg.mort), mean(y_rep)))
## [1] 31.85443 31.85489
Even though the chains converge and the autocorrelation decays, the predicted means are not plausible across the different days of the week.
ppc_stat_grouped(
y = milan.mort$avg.mort,
yrep = y_rep,
group = milan.mort$day.of.week,
stat = 'mean',
binwidth = 0.2
)
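ppc_stat_grouped compares, within each weekday, the observed group mean with the distribution of the same statistic across the replicated datasets. The observed per-group means themselves can be computed with base R; a toy illustration (made-up values, not the Milan data):

```r
y <- c(40, 30, 50, 20)  # outcomes
g <- c(1, 2, 1, 2)      # group labels (e.g. day of week)
tapply(y, g, mean)      # group 1: 45, group 2: 25
```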
#ppc_intervals(
# y = y,
# yrep = y_rep,
# x = milan.mort$day.of.week
#) +
# labs(x = "day of week", y = "avg deaths")
We could take into account the grouping of the data by day of the week and model the variation across days. In this case the response variable is tot.mort, since it would make no sense to model day-of-week variation in averages computed over the last three days. Both the intercept and the standard deviation are estimated separately for each day of the week; moreover, this time the priors are \(\mathcal{N}(0,10)\), since I expect smaller deviations from the mean within grouped data.
\[ \text{tot\_mort}_{i} \sim \mathcal{N}(\mu_{i},\sigma_{d[i]}^2), \qquad \mu_{i} = \alpha_{d[i]} + \beta\, \text{SO2}_i + \gamma\, \text{TSP}_i + \delta\, \text{mean\_temp}_i \] where \(d[i]\) denotes the day of the week of observation \(i\).
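Again, the file model3.stan is not reproduced in the report; a program consistent with the specification above (intercepts and scales varying by day of week, \(\mathcal{N}(0,10)\) priors on the coefficients) might look like this sketch. The prior on the sigma's is an assumption, since the report does not state it explicitly:

```stan
data {
  int<lower=1> N;
  int<lower=1> D;
  vector[N] tot_mort;
  vector[N] SO2;
  vector[N] TSP;
  vector[N] mean_temp;
  int<lower=1, upper=D> day_of_week[N];
}
parameters {
  vector[D] alpha;             // one intercept per day of week
  real beta;
  real gamma;
  real delta;
  vector<lower=0>[D] sigma;    // one scale per day of week
}
model {
  alpha ~ normal(0, 10);
  beta ~ normal(0, 10);
  gamma ~ normal(0, 10);
  delta ~ normal(0, 10);
  sigma ~ normal(0, 10);       // half-normal via the bound; assumed prior
  for (i in 1:N) {
    int d = day_of_week[i];
    tot_mort[i] ~ normal(alpha[d] + beta * SO2[i] + gamma * TSP[i]
                         + delta * mean_temp[i], sigma[d]);
  }
}
generated quantities {
  vector[N] log_lik;
  vector[N] y_rep;
  for (i in 1:N) {
    real mu = alpha[day_of_week[i]] + beta * SO2[i] + gamma * TSP[i]
              + delta * mean_temp[i];
    log_lik[i] = normal_lpdf(tot_mort[i] | mu, sigma[day_of_week[i]]);
    y_rep[i] = normal_rng(mu, sigma[day_of_week[i]]);
  }
}
```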
# reload data
#data(milan.mort)
milan.mort <- old.milan.mort
# weekday data
weekday_data <- milan.mort %>%
select(day.of.week, everything()) %>%
select(-c(rel.humid,resp.mort)) %>%
unique() %>%
arrange(day.of.week) %>%
select(-day.of.week) %>%
as.data.frame()
str(weekday_data)
## 'data.frame': 3652 obs. of 7 variables:
## $ day.num : int 7 14 21 28 35 42 49 56 63 70 ...
## $ holiday : int 0 0 0 0 0 0 0 0 0 0 ...
## $ mean.temp: num -0.6 0.6 5 4.7 9.2 5.5 7.5 8.6 11.8 12.4 ...
## $ tot.mort : num 46 48 51 37 35 43 58 36 49 44 ...
## $ SO2 : num 620 734 375 499 445 ...
## $ TSP : num 216 294 334 270 231 ...
## $ avg.mort : num 42.3 39 46.3 40.3 38 ...
# hierarchical data
hier_data <- list(
N = nrow(milan.mort),
D = length(unique(milan.mort$day.of.week)),
tot_mort = milan.mort$tot.mort,
SO2 = milan.mort$SO2,
TSP = milan.mort$TSP,
mean_temp = milan.mort$mean.temp,
day_of_week = milan.mort$day.of.week,
weekday_data = weekday_data,
holiday = milan.mort$holiday
)
str(hier_data)
## List of 9
## $ N : int 3652
## $ D : int 7
## $ tot_mort : num [1:3652] 45 32 37 33 36 45 46 38 29 39 ...
## $ SO2 : num [1:3652] 267 375 276 440 354 ...
## $ TSP : num [1:3652] 110 153 162 198 235 ...
## $ mean_temp : num [1:3652] 5.6 4.1 4.6 2.9 2.2 0.7 -0.6 -0.5 0.2 1.7 ...
## $ day_of_week : int [1:3652] 2 3 4 5 6 7 1 2 3 4 ...
## $ weekday_data:'data.frame': 3652 obs. of 7 variables:
## ..$ day.num : int [1:3652] 7 14 21 28 35 42 49 56 63 70 ...
## ..$ holiday : int [1:3652] 0 0 0 0 0 0 0 0 0 0 ...
## ..$ mean.temp: num [1:3652] -0.6 0.6 5 4.7 9.2 5.5 7.5 8.6 11.8 12.4 ...
## ..$ tot.mort : num [1:3652] 46 48 51 37 35 43 58 36 49 44 ...
## ..$ SO2 : num [1:3652] 620 734 375 499 445 ...
## ..$ TSP : num [1:3652] 216 294 334 270 231 ...
## ..$ avg.mort : num [1:3652] 42.3 39 46.3 40.3 38 ...
## $ holiday : int [1:3652] 1 0 0 0 0 0 0 0 0 0 ...
## compile the model
model3 <- stan_model("model3.stan", auto_write = TRUE)
## fit the model
model3_fit <- sampling(model3, data = hier_data,
iter = 1000, chains = 4)
##
## SAMPLING FOR MODEL 'model3' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000516 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 5.16 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 1: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 1: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 1: Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 1: Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 1: Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 1: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 1: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 1: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 1: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 1: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 1: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 25.5616 seconds (Warm-up)
## Chain 1: 8.20241 seconds (Sampling)
## Chain 1: 33.764 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'model3' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 0.000445 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 4.45 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 2: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 2: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 2: Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 2: Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 2: Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 2: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 2: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 2: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 2: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 2: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 2: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 19.3779 seconds (Warm-up)
## Chain 2: 10.704 seconds (Sampling)
## Chain 2: 30.0818 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'model3' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 0.000531 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 5.31 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 3: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 3: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 3: Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 3: Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 3: Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 3: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 3: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 3: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 3: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 3: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 3: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 23.5329 seconds (Warm-up)
## Chain 3: 10.8094 seconds (Sampling)
## Chain 3: 34.3423 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'model3' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 0.000454 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 4.54 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 1000 [ 0%] (Warmup)
## Chain 4: Iteration: 100 / 1000 [ 10%] (Warmup)
## Chain 4: Iteration: 200 / 1000 [ 20%] (Warmup)
## Chain 4: Iteration: 300 / 1000 [ 30%] (Warmup)
## Chain 4: Iteration: 400 / 1000 [ 40%] (Warmup)
## Chain 4: Iteration: 500 / 1000 [ 50%] (Warmup)
## Chain 4: Iteration: 501 / 1000 [ 50%] (Sampling)
## Chain 4: Iteration: 600 / 1000 [ 60%] (Sampling)
## Chain 4: Iteration: 700 / 1000 [ 70%] (Sampling)
## Chain 4: Iteration: 800 / 1000 [ 80%] (Sampling)
## Chain 4: Iteration: 900 / 1000 [ 90%] (Sampling)
## Chain 4: Iteration: 1000 / 1000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 27.918 seconds (Warm-up)
## Chain 4: 8.75646 seconds (Sampling)
## Chain 4: 36.6744 seconds (Total)
## Chain 4:
print(model3_fit, pars = c('alpha','beta','gamma','delta','sigma'))
## Inference for Stan model: model3.
## 4 chains, each with iter=1000; warmup=500; thin=1;
## post-warmup draws per chain=500, total post-warmup draws=2000.
##
## mean se_mean sd 2.5% 25% 50% 75% 97.5% n_eff Rhat
## alpha[1] 33.72 0.02 0.52 32.65 33.38 33.71 34.07 34.74 558 1.00
## alpha[2] 33.87 0.02 0.49 32.87 33.53 33.88 34.19 34.82 563 1.00
## alpha[3] 33.29 0.02 0.51 32.28 32.96 33.29 33.62 34.32 505 1.01
## alpha[4] 33.57 0.02 0.51 32.55 33.25 33.57 33.92 34.56 544 1.00
## alpha[5] 33.26 0.02 0.50 32.28 32.94 33.27 33.59 34.27 529 1.00
## alpha[6] 33.05 0.02 0.49 32.09 32.72 33.05 33.38 34.03 544 1.00
## alpha[7] 33.05 0.02 0.47 32.17 32.74 33.06 33.37 33.97 551 1.00
## beta 0.02 0.00 0.00 0.02 0.02 0.02 0.02 0.02 1305 1.00
## gamma -0.01 0.00 0.00 -0.01 -0.01 -0.01 0.00 0.00 1778 1.00
## delta -0.23 0.00 0.02 -0.26 -0.24 -0.23 -0.21 -0.19 501 1.00
## sigma[1] 7.13 0.01 0.23 6.72 6.98 7.12 7.28 7.58 1917 1.00
## sigma[2] 6.72 0.00 0.20 6.33 6.58 6.71 6.85 7.14 1889 1.00
## sigma[3] 6.51 0.00 0.19 6.16 6.38 6.50 6.63 6.90 2048 1.00
## sigma[4] 6.85 0.00 0.21 6.46 6.71 6.84 6.99 7.29 1873 1.00
## sigma[5] 6.98 0.01 0.21 6.58 6.84 6.98 7.12 7.43 1651 1.00
## sigma[6] 6.58 0.01 0.21 6.20 6.44 6.58 6.72 7.01 1343 1.00
## sigma[7] 6.68 0.01 0.20 6.29 6.54 6.66 6.82 7.10 1470 1.00
##
## Samples were drawn using NUTS(diag_e) at Sun Jun 16 15:43:31 2019.
## For each parameter, n_eff is a crude measure of effective sample size,
## and Rhat is the potential scale reduction factor on split chains (at
## convergence, Rhat=1).
#computing psis-looic
log_lik_3 <- extract_log_lik(model3_fit)
loo_3 <- loo(log_lik_3)
## Warning: Relative effective sample sizes ('r_eff' argument) not specified.
## For models fit with MCMC, the reported PSIS effective sample sizes and
## MCSE estimates will be over-optimistic.
#waic_3 <- waic(log_lik_3)
# areas plots
mcmc_areas(as.matrix(model3_fit, pars = c('alpha')))
mcmc_areas(as.matrix(model3_fit, pars = c('beta','gamma')))
mcmc_areas(as.matrix(model3_fit, pars = c('delta')))
mcmc_areas(as.matrix(model3_fit, pars = c('sigma')))
## traceplots
plot(model3_fit, plotfun = "trace", pars = c('alpha','beta','gamma','sigma')) + ggtitle("traceplots")
## acf plots
mcmc_acf(as.matrix(model3_fit, pars=c("alpha","beta","gamma","sigma")))
## parameters distributions
mcmc_hist(as.matrix(model3_fit, pars = c('alpha','beta','gamma','sigma')))
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
## posterior predictive check
y_rep <- as.matrix(model3_fit, pars = "y_rep")
ppc_dens_overlay(y = milan.mort$tot.mort, y_rep[1:200,])
# real mean vs estimated mean
ppc_stat_grouped(
y = milan.mort$tot.mort,
yrep = y_rep,
group = milan.mort$day.of.week,
stat = 'mean',
binwidth = 0.2
)
Now the model correctly captures the average number of deaths across the different days of the week.
Model the nonlinear effects of some covariates.
# pairplots for numerical variables
pairs(tot.mort ~ mean.temp + rel.humid + SO2 + TSP, data=milan.mort)
ggplot(data = milan.mort, aes(SO2,mean.temp)) + geom_point()
Now consider a GLM with a Poisson-distributed response for tot.mort, comparing the fitted response values with those obtained previously.
fit_glm <- stan_glm (
formula=tot.mort ~ SO2 + TSP + mean.temp,
data=milan.mort,
family=poisson(),
prior=normal(0,10),
prior_intercept = normal(0,10))
##
## SAMPLING FOR MODEL 'count' NOW (CHAIN 1).
## Chain 1:
## Chain 1: Gradient evaluation took 0.000665 seconds
## Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 6.65 seconds.
## Chain 1: Adjust your expectations accordingly!
## Chain 1:
## Chain 1:
## Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 1:
## Chain 1: Elapsed Time: 11.7738 seconds (Warm-up)
## Chain 1: 29.0411 seconds (Sampling)
## Chain 1: 40.8148 seconds (Total)
## Chain 1:
##
## SAMPLING FOR MODEL 'count' NOW (CHAIN 2).
## Chain 2:
## Chain 2: Gradient evaluation took 0.000602 seconds
## Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 6.02 seconds.
## Chain 2: Adjust your expectations accordingly!
## Chain 2:
## Chain 2:
## Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 2:
## Chain 2: Elapsed Time: 9.9854 seconds (Warm-up)
## Chain 2: 12.8867 seconds (Sampling)
## Chain 2: 22.8721 seconds (Total)
## Chain 2:
##
## SAMPLING FOR MODEL 'count' NOW (CHAIN 3).
## Chain 3:
## Chain 3: Gradient evaluation took 0.000647 seconds
## Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 6.47 seconds.
## Chain 3: Adjust your expectations accordingly!
## Chain 3:
## Chain 3:
## Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 3:
## Chain 3: Elapsed Time: 12.5336 seconds (Warm-up)
## Chain 3: 15.3393 seconds (Sampling)
## Chain 3: 27.8728 seconds (Total)
## Chain 3:
##
## SAMPLING FOR MODEL 'count' NOW (CHAIN 4).
## Chain 4:
## Chain 4: Gradient evaluation took 0.000614 seconds
## Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 6.14 seconds.
## Chain 4: Adjust your expectations accordingly!
## Chain 4:
## Chain 4:
## Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
## Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
## Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
## Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
## Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
## Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
## Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
## Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
## Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
## Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
## Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
## Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
## Chain 4:
## Chain 4: Elapsed Time: 21.0345 seconds (Warm-up)
## Chain 4: 13.9177 seconds (Sampling)
## Chain 4: 34.9522 seconds (Total)
## Chain 4:
summary(fit_glm)
##
## Model Info:
##
## function: stan_glm
## family: poisson [log]
## formula: tot.mort ~ SO2 + TSP + mean.temp
## algorithm: sampling
## priors: see help('prior_summary')
## sample: 4000 (posterior sample size)
## observations: 3652
## predictors: 4
##
## Estimates:
## mean sd 2.5% 25% 50% 75%
## (Intercept) 3.5 0.0 3.5 3.5 3.5 3.5
## SO2 0.0 0.0 0.0 0.0 0.0 0.0
## TSP 0.0 0.0 0.0 0.0 0.0 0.0
## mean.temp 0.0 0.0 0.0 0.0 0.0 0.0
## mean_PPD 31.9 0.1 31.6 31.8 31.9 31.9
## log-posterior -12291.8 1.4 -12295.4 -12292.5 -12291.5 -12290.8
## 97.5%
## (Intercept) 3.6
## SO2 0.0
## TSP 0.0
## mean.temp 0.0
## mean_PPD 32.1
## log-posterior -12290.1
##
## Diagnostics:
## mcse Rhat n_eff
## (Intercept) 0.0 1.0 2441
## SO2 0.0 1.0 2124
## TSP 0.0 1.0 2224
## mean.temp 0.0 1.0 2853
## mean_PPD 0.0 1.0 1840
## log-posterior 0.0 1.0 1511
##
## For each parameter, mcse is Monte Carlo standard error, n_eff is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence Rhat=1).
print(fit_glm$coefficients)
## (Intercept) SO2 TSP mean.temp
## 3.5312595269 0.0005397528 -0.0001734769 -0.0081485096
# The canonical link for the Poisson model is the logarithm,
# so the coefficients act multiplicatively on the expected count.
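Because of the log link, a fitted mean is obtained by exponentiating the linear predictor. Using the posterior mean coefficients printed above and (approximately) the covariates of the first observation in the data list (SO2 ≈ 267, TSP ≈ 110, mean.temp = 5.6):

```r
b <- c(intercept = 3.5312595269, SO2 = 0.0005397528,
       TSP = -0.0001734769, mean.temp = -0.0081485096)
eta <- b["intercept"] + b["SO2"] * 267 + b["TSP"] * 110 + b["mean.temp"] * 5.6
unname(exp(eta))  # about 37 expected deaths, plausible given mean tot.mort ~ 32
```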
# traceplots
plot(fit_glm, plotfun = "trace") + ggtitle("glm traceplots")
# acf plots
mcmc_acf(as.matrix(fit_glm, pars=c("(Intercept)","SO2","TSP","mean.temp")))
# final comparison
print(loo_1)
##
## Computed from 4000 by 3650 log-likelihood matrix
##
## Estimate SE
## elpd_loo -11181.2 52.0
## p_loo 4.6 0.3
## looic 22362.5 104.1
## ------
## Monte Carlo SE of elpd_loo is 0.0.
##
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.
print(loo_2)
##
## Computed from 4000 by 3650 log-likelihood matrix
##
## Estimate SE
## elpd_loo -11001.6 58.9
## p_loo 5.9 0.5
## looic 22003.1 117.9
## ------
## Monte Carlo SE of elpd_loo is 0.0.
##
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.
print(loo_3)
##
## Computed from 2000 by 3652 log-likelihood matrix
##
## Estimate SE
## elpd_loo -12178.1 54.6
## p_loo 20.9 1.9
## looic 24356.2 109.1
## ------
## Monte Carlo SE of elpd_loo is 0.1.
##
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.
print(loo(fit_glm))
##
## Computed from 4000 by 3652 log-likelihood matrix
##
## Estimate SE
## elpd_loo -12288.7 73.1
## p_loo 5.9 0.3
## looic 24577.5 146.2
## ------
## Monte Carlo SE of elpd_loo is 0.1.
##
## All Pareto k estimates are good (k < 0.5).
## See help('pareto-k-diagnostic') for details.
Comparing the fitted models within each of the two response variables (note that the avg.mort models are fitted on 3,650 observations and the tot.mort models on 3,652, so elpd values are not directly comparable across responses), the models with the lowest looic, equivalently the highest elpd_loo, are the temperature model for the average number of deaths and the hierarchical model for the total number of deaths.
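Within each response, the loo objects computed above can also be compared directly; assuming the installed loo package provides loo_compare (available since loo 2.0), a sketch:

```r
# avg.mort models
loo_compare(loo_1, loo_2)
# tot.mort models: hierarchical model vs the Poisson GLM
loo_compare(loo_3, loo(fit_glm))
```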